Goto

Collaborating Authors

 persistent diagram


Topological Spatial Graph Coarsening

Calissano, Anna, Lasalle, Etienne

arXiv.org Machine Learning

Spatial graphs are particular graphs for which the nodes are localized in space (e.g., public transport network, molecules, branching biological structures). In this work, we consider the problem of spatial graph reduction, that aims to find a smaller spatial graph (i.e., with less nodes) with the same overall structure as the initial one. In this context, performing the graph reduction while preserving the main topological features of the initial graph is particularly relevant, due to the additional spatial information. Thus, we propose a topological spatial graph coarsening approach based on a new framework that finds a trade-off between the graph reduction and the preservation of the topological characteristics. The coarsening is realized by collapsing short edges. In order to capture the topological information required to calibrate the reduction level, we adapt the construction of classical topological descriptors made for point clouds (the so-called persistent diagrams) to spatial graphs. This construction relies on the introduction of a new filtration called triangle-aware graph filtration. Our coarsening approach is parameter-free and we prove that it is equivariant under rotations, translations and scaling of the initial spatial graph. We evaluate the performances of our method on synthetic and real spatial graphs, and show that it significantly reduces the graph sizes while preserving the relevant topological information.


Reproduction of IVFS algorithm for high-dimensional topology preservation feature selection

Wang, Zihan

arXiv.org Artificial Intelligence

Feature selection is a crucial technique for handling high-dimensional data. In unsupervised scenarios, many popular algorithms focus on preserving the original data structure. In this paper, we reproduce the IVFS algorithm introduced in AAAI 2020, which is inspired by the random subset method and preserves data similarity by maintaining topological structure. We systematically organize the mathematical foundations of IVFS and validate its effectiveness through numerical experiments similar to those in the original paper. The results demonstrate that IVFS outperforms SPEC and MCFS on most datasets, although issues with its convergence and stability persist.


A Vectorization Method Induced By Maximal Margin Classification For Persistent Diagrams

Wu, An, Pan, Yu, Zhou, Fuqi, Yan, Jinghui, Liu, Chuanlu

arXiv.org Artificial Intelligence

Persistent homology is an effective method for extracting topological information, represented as persistent diagrams, of spatial structure data. Hence it is well-suited for the study of protein structures. Attempts to incorporate Persistent homology in machine learning methods of protein function prediction have resulted in several techniques for vectorizing persistent diagrams. However, current vectorization methods are excessively artificial and cannot ensure the effective utilization of information or the rationality of the methods. To address this problem, we propose a more geometrical vectorization method of persistent diagrams based on maximal margin classification for Banach space, and additionaly propose a framework that utilizes topological data analysis to identify proteins with specific functions. We evaluated our vectorization method using a binary classification task on proteins and compared it with the statistical methods that exhibit the best performance among thirteen commonly used vectorization methods. The experimental results indicate that our approach surpasses the statistical methods in both robustness and precision.


Persistent Homological State-Space Estimation of Functional Human Brain Networks at Rest

Chung, Moo K., Huang, Shih-Gu, Carroll, Ian C., Calhoun, Vince D., Goldsmith, H. Hill

arXiv.org Artificial Intelligence

The paper introduces a new data-driven topological data analysis (TDA) method for studying dynamically changing human functional brain networks obtained from the resting-state functional magnetic resonance imaging (rs-fMRI). Leveraging persistent homology, a multiscale topological approach, we present a framework that incorporates the temporal dimension of brain network data. This allows for a more robust estimation of the topological features of dynamic brain networks. The method employs the Wasserstein distance to measure the topological differences between networks and demonstrates greater efficiency and performance than the commonly used -means clustering in defining the state spaces of dynamic brain networks. Our method maintains robust performance across different scales and is especially suited for dynamic brain networks. In addition to the methodological advancement, the paper applies the proposed technique to analyze the heritability of overall brain network topology using a twin study design. The study investigates whether the dynamic pattern of brain networks is a genetically influenced trait, an area previously underexplored. By examining the state change patterns in twin brain networks, we make significant strides in understanding the genetic factors underlying dynamic brain network features. Furthermore, the paper makes its method accessible by providing MATLAB codes, contributing to reproducibility and broader application.


Shape-Preserving Dimensionality Reduction : An Algorithm and Measures of Topological Equivalence

Yu, Byeongsu, You, Kisung

arXiv.org Machine Learning

We introduce a linear dimensionality reduction technique preserving topological features via persistent homology. The method is designed to find linear projection $L$ which preserves the persistent diagram of a point cloud $\mathbb{X}$ via simulated annealing. The projection $L$ induces a set of canonical simplicial maps from the Rips (or \v{C}ech) filtration of $\mathbb{X}$ to that of $L\mathbb{X}$. In addition to the distance between persistent diagrams, the projection induces a map between filtrations, called filtration homomorphism. Using the filtration homomorphism, one can measure the difference between shapes of two filtrations directly comparing simplicial complexes with respect to quasi-isomorphism $\mu_{\operatorname{quasi-iso}}$ or strong homotopy equivalence $\mu_{\operatorname{equiv}}$. These $\mu_{\operatorname{quasi-iso}}$ and $\mu_{\operatorname{equiv}}$ measures how much portion of corresponding simplicial complexes is quasi-isomorphic or homotopy equivalence respectively. We validate the effectiveness of our framework with simple examples.


Topological Data Analysis of COVID-19 Virus Spike Proteins

Chung, Moo K., Ombao, Hernando

arXiv.org Machine Learning

Topological data analysis, including persistent homology, has undergone significant development in recent years. However, one outstanding challenge is to build a coherent statistical inference procedure on persistent diagrams. The paired dependent data structure, as birth and death in persistent diagrams, adds additional complexity to the development. In this paper, we present a new lattice path representation for persistent diagrams. A new exact statistical inference procedure is developed for lattice paths via combinatorial enumerations. The proposed lattice path method is applied to the topological characterization of the protein structures of COVID-19 viruse. We demonstrate that there are topological changes during the conformation change of spike proteins that are needed to initiate the infection of host cells.


IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation

Li, Xiaoyun, Wu, Chengxi, Li, Ping

arXiv.org Machine Learning

Feature selection is an important tool to deal with high dimensional data. In unsupervised case, many popular algorithms aim at maintaining the structure of the original data. In this paper, we propose a simple and effective feature selection algorithm to enhance sample similarity preservation through a new perspective, topology preservation, which is represented by persistent diagrams from the context of computational topology. This method is designed upon a unified feature selection framework called IVFS, which is inspired by random subset method. The scheme is flexible and can handle cases where the problem is analytically intractable. The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data. We demonstrate that our algorithm can provide satisfactory performance under a sharp sub-sampling rate, which supports efficient implementation of our proposed method to large scale datasets. Extensive experiments validate the effectiveness of the proposed feature selection scheme.


Topology Distance: A Topology-Based Approach For Evaluating Generative Adversarial Networks

Horak, Danijela, Yu, Simiao, Salimi-Khorshidi, Gholamreza

arXiv.org Machine Learning

Automatic evaluation of the goodness of Generative Adversarial Networks (GANs) has been a challenge for the field of machine learning. In this work, we propose a distance complementary to existing measures: Topology Distance (TD), the main idea behind which is to compare the geometric and topological features of the latent manifold of real data with those of generated data. More specifically, we build Vietoris-Rips complex on image features, and define TD based on the differences in persistent-homology groups of the two manifolds. We compare TD with the most commonly used and relevant measures in the field, including Inception Score (IS), Frechet Inception Distance (FID), Kernel Inception Distance (KID) and Geometry Score (GS), in a range of experiments on various datasets. We demonstrate the unique advantage and superiority of our proposed approach over the aforementioned metrics. A combination of our empirical results and the theoretical argument we propose in favour of TD, strongly supports the claim that TD is a powerful candidate metric that researchers can employ when aiming to automatically evaluate the goodness of GANs' learning.


Efficient Topological Layer based on Persistent Landscapes

Kim, Kwangho, Kim, Jisu, Kim, Joon Sik, Chazal, Frederic, Wasserman, Larry

arXiv.org Machine Learning

We propose a novel topological layer for general deep learning models based on persistent landscapes, in which we can efficiently exploit underlying topological features of the input data structure. We use the robust DTM function and show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network architecture and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of the topological layer is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a tight stability theorem, and show that the proposed layer is robust towards noise and outliers. We demonstrate the effectiveness of our approach by classification experiments on various datasets.


Unsupervised Space-Time Clustering using Persistent Homology

Islambekov, Umar, Gel, Yulia

arXiv.org Machine Learning

This paper presents a new clustering algorithm for space-time data based on the concepts of topological data analysis and in particular, persistent homology. Employing persistent homology - a flexible mathematical tool from algebraic topology used to extract topological information from data - in unsupervised learning is an uncommon and a novel approach. A notable aspect of this methodology consists in analyzing data at multiple resolutions which allows to distinguish true features from noise based on the extent of their persistence. We evaluate the performance of our algorithm on synthetic data and compare it to other well-known clustering algorithms such as K-means, hierarchical clustering and DBSCAN. We illustrate its application in the context of a case study of water quality in the Chesapeake Bay.